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Planned wide-field weak tensing surveys are expected to reduce the statistical errors on the shear field to unprecedented levels. In 
contrast, systematic errors like those induced by the convolution with the point spread function (PSF) will not benefit from that 
scaling eff'ect and will require very accurate modeling and correction. While numerous methods have been devised to carry out the 
PSF correction itself, modeling of the PSF shape and its spatial variations across the instrument field of view has, so far, attracted 
much less attention. This step is nevertheless crucial because the PSF is only known at star positions while the correction has to be 
performed at any position on the sky. A reliable interpolation scheme is therefore mandatory and a popular approach has been to use 
low-order bivariate polynomials. In the present paper, we evaluate four other classical spatial interpolation methods based on splines 
(B-splines), inverse distance weighting (IDW), radial basis functions (RBF) and ordinary Kriging (OK). These methods are tested 
on the Star-challenge part of the GRavitational lEnsing Accuracy Testing 2010 (GREAT 10) simulated data and are compared with 
the classical polynomial fitting (Polyfit). In all our methods we model the PSF using a single Moff'at profile and we interpolate the 
fitted parameters at a set of required positions. This allowed us to win the Star-challenge of GREAT 10, with the B-splines method. 
However, we also test all our interpolation methods independently of the way the PSF is modeled, by interpolating the GREAT 10 
star fields themselves (i.e., the PSF parameters are known exactly at star positions). We find in that case RBF to be the clear winner, 
closely followed by the other local methods, IDW and OK. The global methods, Polyfit and B-splines, are largely behind, especially 
in fields with (ground-based) turbulent PSFs. In fields with non-turbulent PSFs, all interpolators reach a variance on PSF systematics 
crj better than the 1 x 10"'' upper bound expected by future space-based surveys, with the local interpolators performing better than 



the global ones. 
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1. Introduction 



Methods: data analysis 



The convolution of galaxy images with a Point Spread Function 
(PSF) is among the primary sources of systematic error in weak 
tensing measurement. The isotropic part of the PSF kernel makes 
the galaxy shape appear rounder, while the anisotropic part in- 
troduces an artificial shear eflfect that may be confused with the 
genuine shear tensing signal. 

To tackle these issues, various PSF correction methods have 



been proposed (|Kaiser et al. 


11995 


|Luppino & Kaiser 


1997; 
2002^ 


Hoekstra et al.||1998[ |Kaiser 


2000 


IBernstein & Jarvis, 


Hirata & Seljak 2003| Refregier & Bacon 2003) and 


some 



of them implemented as part of shear measurement pipelines 
( [Heymans et al. 2006; Massey et al. 2007 ; Bridle et al. 2010). 
However, these correction schemes do not have built-in solutions 
for addressing another problem: the spatial variation of the PSF 
across the instrument field of view that may arise, for instance, 
from imperfect telescope guidance, optical aberrations or atmo- 
spheric distortions. 

A non-constant PSF field implies the PSF is no longer ac- 
curately known at galaxy positions and must then be estimated 
for the accurate shape measurement of galaxies. Bivariate poly- 
nomials, typically used as interpolators for this purpose, have in 
several cases been found unable to reproduce sparse, multi-scale 
or quickly varying PSF anisotropy patterns ([Hoekstra] |2004[ 



Jarvis & Jain|20Q4t|Van Waerbeke et al.|2002||2005t|Jee & Tyson 

20TT] r 

This raises the question of whether there exists alternative 
PSF models and interpolation schemes better suited for PSF es- 



timation than those used so far. Indeed, it seems important to 
improve this particular aspect of PSF modeling in the perspec- 
tive of future space-based missions such as Euclid or advanced 
ground-based telescopes like the LSST ( Jee & Tyson|2011 ). 

Only recently has the PSF variation problem begun to be 
taken seriously with, notably, the advent of the GRavitational 
lEnsing Accuracy Testing 2010 ( GREAT 10) Star Challenge, on e 
of the two GREATIO challenges ( [Kitching et al.|2011|[2QT2b ). 
The Star Challenge images have been designed to simulate a 
variety of typical position-varying PSF anisotropy patterns and 
competing PSF interpolation methods were judged on their abil- 
ity to reconstruct the true PSF field at asked, non-star positions. 

The Star Challenge gave us the opportunity to evaluate a 
number of alternative schemes suitable for the interpolation of 
realistic, spatially- varying PSF fields. The objective of this pa- 
per is twofold: (1) to describe our approach for tackling the prob- 
lems raised by the Star Challenge and to discuss our results; (2) 
to perform a comparative analysis of the diflTerent interpolation 
methods after applying them on the Star Challenge simulations. 

Our paper is thus organized as follows. We begin by review- 
ing the most commonly used PSF representation and interpola- 
tion schemes in Sect.[2]and continue with a overview of the inter- 
polation schemes mentioned above in Sect. [3] We then describe 
our PSF estimation pipeline and analyze our results in Sects. |4] 
and [5] Lastly, in Sect.[6j we measure the respective accuracy of 
all methods based on the solutions made available after comple- 
tion of the challenge and discuss the merits of each method. We 
conclude in Sect. [T] 
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2. An overview of existing PSF interpolation 
schemes 

Before correcting galaxies in the image for a spatially- varying 
PSF field, every shear measurement pipeline has, in one way or 
another, to interpolate the PSF between the stars, as illustrated 
in Fig. [T] The way this is best achieved depends essentially on 
the PSF model used and on the PSF interpolation algorithm. The 
PSF model defines which features of the PSF are to be repre- 
sented, which also determines on which quantities spatial inter- 
polation is performed. The role of the interpolation scheme, on 
the other hand, is to apply a prediction algorithm to find the best 
estimates for those quantities. 

In the KSB metho d Kaiser et al. ( 1995 ) and its KSB-h variant 
( Luppino & Kaiser|1997^,Hoekstra et al.,199 8 ), the relevant fea- 
tures of the PSF model are its ellipticity and size, which are es- 
timated from the second-order geometrical moments of the PSF 
image. The main idea behind the PSF correction scheme is that 
the PSF distortion on a galaxy image can be well described by a 
small but highly anisotropic kernel convolved with a large, cir- 
cular seeing disk. To find the appropriate q for galaxies, the val- 
ues of q* at star positions (and sometimes the so-called "smear" 
and "shear" polarization tensors P^^* and p^^*) are interpolated 
across the image. For doing so, the typical procedure is to fit a 
second or third-order bivariate polynomial function. 

Exactly which quantity is interpolated and which order is 
used for the polynomial depends on the KSB-h imple mentations. 
See e. g. Heymans et al. (2006), Appendix [A| ,Massey et al. 
(|2007|) and recently published studies using KSB-F ([Hoekstra 
e^al.''l998! [Clowe & Schneider" "2002; "He ymans e t al."2005! 



Hetterscheidt et al... 
eTaLjlOOSllUmetsu et al.|2010| ). 



[2007; Paulin-Henriksson et al.j^OOT^ 



Fu 



A model representing a PSF as only a size and first-order de- 
viation from circularity certainly appears quite restrictive. One 
can instead look for an extensive, but compact description of the 
PSF image, better suited to operations like noise filtering or de- 
convolution. A natural approach is to characterize the full PSF as 
a compact, complete set of orthogonal basis functions provided 
in analytical form, each basis being associated with a particular 
feature of the image (shape, frequency range, etc.). Ideally, this 
would not only simplify galaxy deconvolution from the PSF but 
also allow to better model the spatial variation of the PSF across 
the field of view. 

'Bernstein & Jarvis (2002 ) and Refregier (2003); Refregier & 
[Bacon^(2003j ; Massey & Refregier (2005) have proposed PSF 
expansions based on the eigenfunctions of the two-dimensional 
quantum harmonic oscillator, expressed in terms of Gauss- 




Fig. 1. Interpolating a spatially- varying PSF field. The illustrated 
field is a subset of an actual GREATIO Star Challenge PSF field. 



Laguerre orthogonal polynomials ( [Abramowitz & Stegun 1965 ). 
These functions can be interpreted as perturbations around a 
circular or elliptical Gaussian. The eff'ect of a given operation 
(such as shear or convolution), on an image can then be traced 
through its contribution on each coefficient in the basis func- 
tion expansion. For instance, the second-order /2,2 coeflftcient 
of a Shapelet is the ellipticity estimator based on the Gaussian- 
weighted quadrupole moments used in KSB. 

Modeling the PSF variation patterns with Shapelets typi- 
cally involves the following steps: stars are expanded in terms 
of Shapelet basis functions and the expansion coeflftcients for 
each of the basis functions are fitted with a third or fourth-order 
polynomial. The interpolated values of the Shape let coefficients 
are then used to reconstruct the PSF at galaxy positions. 

This scheme has been successfully applied to several weak 
lensing cluste r studies ([Tee e t al. 2005a b, 2006| |2007b[ |Berge| 
|et al. 2008; R omano et al.||2010 ). However, it has been argued 
( |Jee et al. 2007a[ Melchior et al.j,2010) that even a high-order 
Shapelet-based PSF model is unable to reproduce extended PSF 
features (such as its wings) and that the flexibility of the model 
makes it vulnerable to pixelation and noise. So, although the 
level of residual errors after Shapelets decomposition appears 
low enough for cluster analysis, it may prove too high for 
precision cosmic shear measurement. 

Actually, it is not clear if there exists any set of basis 
functions expressed in analytical form that is capable of accu- 
rately describing all the signal frequencies contained in the PSF. 
An alternative approach is to decompose the PSF in terms of 
basis functions directly derived from the data through Principal 
Component Analysi s (PCA), as pioneered by Lauer (2002), 
Lupton et al. ( 2001| ). This approach is supposed to yield a set 
of basis function, the so-called "Principal Components", opti- 
mized for a particular data configuration and sorted according to 
how much they contribute to the description of the data. 

In practice, two main procedures have been experimented 
that essentially depend o n the type data where PCA is applied. 
Jarvis & Jaml (|2004) and Schrabback et al." (2010) fit selected 
components of the PSF (e.g. ellipticity or KSB anisotropy ker- 
nel) across all image exposures with a two-dimensional polyno- 
mial of order 3 or 4. PCA analysis is performed on the coeflft- 
cients of the polynomial, which allows the large-scale variations 
of the PSF in each exposure to be expressed as a weighted sum 
of a small set of principal components. A further, higher-order 
polynomial fit is then conducted on each exposure to capture 
more detailed features of the PSF. 

On the othe r hand , and more recently, |Jee et al.| ( |2007a| ), 
Nakajima et al. (2009) and Jee & T yson (2011 ) experimented 
a diff'erent procedure for modeling the variation of the Hubble 
Space telescope (HST) ACS camera and a simulated Large 
Synoptic Survey Telescope (LSST) PSF. Instead of applying 
PCA on polynomial coefficients, they perform a PCA decom- 
position on the star images themselves into a basis made of the 
most discriminating star building blocks. Each star can then be 
expanded in terms of these "eigenPSFs" and the spatial varia- 
tion of their coefficients in that basis is modeled with a bivariate 
polynomial. 

Regardless of the procedure used, the PCA scheme proves 
superior to wavelets and Shapelet for reproducing smaller- scale 
features in the PSF variation pattern, thanks to improved PSF 
m odeling and the use o f higher-order polynomials. In the case 
of [Jarv is & Jain ( 2004| ), applying PCA across all exposures al- 
lowed to compensate for the small number of stars available per 
exposure. Moreover, PCA is not tied to any specific PSF model. 
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It should be noted, however, that at least two factors 
may limit the performance of PCA in practical weak lensing 
applications: the first is that the PCA algorithm is only able 
to capture linear relationships in the data and thus may fail to 
reproduce some types of high-frequency variation patterns; the 
other is that PCA misses the components of the PSF pattern 
that are random and uncorrected, such as those arising from 
atmospheric turbulence. How serious these limitations prove 
to be and h ow they can be overcome need to be i nvestigated 
further (e.g. jJain et al.|2Q06t[Schrabback et al.(2QT0l ). 



All the above methods attempt to model PSF variation pat- 
terns in an empirical way by the application of some mathemat- 
ical formalism. It may, on the contrary be more beneficial to un- 
derstand which physical phenomena determine the structure of 
the PSF patterns and, once done, seek appropriates models for 
reproducing them ( |Jee et al.|2Q07a ; Stabena u et al.|2007t |Jee &| 
|Tyson|2Qll| ). The PSF of the HST ACS camera, for instance, has 
been studied extensively and in some cases, the phy sical origin 
of some of the patterns clearly identified. Jee et al. (2007a) and 
[Jee & Tyson (2011) could relate the primary principal compo- 
nent to changes in telescope focus causes by constraints on the 
secondary mirror supporting structure and the "thermal breath- 
ing" of the telescope. 

In fact, various combined eff'ects make the PSF vary spatially 
or over time. Some patterns are linked to the behavior of the op- 
tical system of the telescope or the detectors. Others are related 
to mechanical or thermal eff'ects that make the telescope move 
slightly during an observation. For ground-based instruments, 
refraction in the atmosphere and turbulence induce further PSF 
distortion. 

Incorporating such a wide diversity of eff'ects into a PSF 
variation model is not an easy task. However, according to 



Jarvis et al. (2008), models of low-order optical aberrations 
such as focus and astigmatism can reproduce 90% of the 
PSF anisotropy patterns found in real observation data. If so, 
physically-motivated models could provide an alternative or a 
complement to purely empirical methods such as PCA. 

3. Looking for better PSF interpolation schemes 

The analysis of commonly used PSF interpolation schemes in 
the previous section has shown that the range of PSF interpo- 
lation algorithms is actually quite restricted: almost always the 
quantities used to characterize the PSF are fitted using a bivariate 
polynomial function. 

But it is important to acknowledge there may exist alternative 
interpolation schemes that would prove more eff'ective for that 
purpose than polynomial fitting. Beyond this, it is essential to 
recognize the goal here is not to only interpolate changes in the 
PSF but also to perfo rm a spatial interp olation of such changes. 

Interpolation (e.g.<Press et al.|2007| ) is commonly understood 
as the process of estimating of values at location where no sam- 
ple is available, based on values measured at sample locations. 
Spatial interpolation diff'ers from regular interpolation in that it 
can take into account and potentially exploit spatial relationships 
in the data. In particular, it is often the case that points close to- 
gether in space are more likely to be similar than points further 
apart. In other words, points may be spatially autocorrelated, at 
least locally. Most spatial interpolation methods attempt to make 
use of such information to improve their estimates. 

After a critical review of polynomial fitting, we consider and 
discuss alternative spatial interpolation schemes for modeling 
PSF variation patterns. 



3.1. A critical view of polynomial fitting 

In the context of spatial interpolation, fitting polynomial func- 
tions of the spatial coordinate x = (xi,yi) to the sampled z(x) 
values of interest by ordinary least squares regression (OLS) is 
known as "Trend Surface Analysis" (TSA). The fitting process 
thus consists in minimizing the sum of squares for (z(x) - z(x)), 
assuming the data can be modeled as a surface of the form 



r 



(1) 



r+s<p 



The integer p is the order of the trend surface (and the order 
of the underlying polynomial). Finding the bi coeflftcients is a 
standard problem in multiple regression and can be computed 
with standard statistical packages. 

In the literature reviewed from the previous section, authors 
often justify their choice of polynomial fitting by arguing the 
PSF varies in a smooth manner over the image. Indeed trend 
surfaces are well suited to modeling broad features in the data 
with a smooth polynomial surface, commonly of order 2 or 3. 

However, PSF variation patterns result from a variety of 
physical eff'ects and even though polynomials may adequately 
reproduce the smoothest variations, there may exist several other 
types of patterns that a low-order polynomial function cannot 
capture. Polynomials are also quite poor at handling disconti- 
nuities or abrupt changes in the data. This concerns particularly 
sharp discontinuities across chip gaps and rapid changes often 
found near the corners of the image. 

An illustrative example of the shortcomings just described 
was the detection of a suspicious non-zero B- mode cosmic shear 
signal in the VIRMQS -PES CA RT survey (.Van Waerbeke et al. 
|2002| ). After investigation poekstra|2004[ | Van Waerbeke 



2001 



et al. 



2005), the small scale component of the signal was traced 



to the PSF interpolation scheme: the second-order polynomial 
function was unable to reproduce the rapid change in PSF 
anisotropy at the edges of the images. In fact, one of the main 
limitation of polynomials when used for interpolating PSF im- 
ages in weak lensing studies lie in their inability to reproduce 
variations on scales smaller than the typical inter-stellar distance 
on the plane of the sky (often < 1 arcmin at high galactic lati- 
tude). 

Unfortunately there are no satisfactory solutions to these 
shortcomings. Increasing the order of the polynomial function 
does not help as it leads to oscillations while attempting to cap- 
ture smaller- scale or rapidly- varying features. The z(x) values 
may reach extremely (and unnaturally) small or large values near 
the edge or just outside the area covered by the data. Such ex- 
treme values can also create problems in calculations. 

One way to alleviate such problems is to pack more densely 
the interpolating points closer to the boundaries, but this may 
not be easy to a chieve in practice. Hoekstra (2004) and Van 
[Waerbek e et al.| p005 ) also obtained good results with an in- 
terpolator made of a polynomial function to model large-scale 
changes combined with a rational function to deal with small- 
scale variations. It is not clear, however, if this scheme can be 
safely applied on diff'erent data and this may require a signifi- 
cant amount of fine tuning. 

In addition to the issues just mentioned, local eff'ects in one 
part of the image may influence the fit of the whole regression 
surface, which makes trend surfaces very sensitive to outliers, 
clustering efl'ects or observable errors in the z(x). Finally, OLS 
regression implicitly assumes the z(x) are normally distributed 
with uncorrected residuals. These assumptions do not hold true 
when spatial dependence exists in the data. 
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Table 1. Least squares polynomial fitting / Trend surface: Pros 
and cons 

Least squares polynomial Fitting 



Table 2. Spatial interpolation methods reviewed in this article 



Pros 



Simple and intuitive 
Fast to compute 



Cons 



Usually only able to capture broad features (underfitting) 

Increasing the order of polynomials does not improve and 
generally degrades accuracy (overfitting) 

High-order polynomials generate numerical issues (round- 
ing errors, overflow, etc.) 
High sensitivity to outliers and fitting errors 
Local changes propagate to the whole polynomial surface 
No estimation of interpolation errors (deterministic) 



Actually, the fact that trend surfaces tend to ignore any spa- 
tial correlation or small-scale variations can turn into an advan- 
tage to remove broad features of the data prior to using some 
other, finer-grained interpolator. Indeed, we saw in section §1 
that |Jarvis & Jain| ( [2004| ) took advantage of this feature in their 
PCA-based interpolation scheme. 

Most of the aforementioned limitations are rooted in the 
use of standard polynomials. One possible way out is to aban- 
don trend surfaces altogether and use piecewise pol ynomials 
instead (especially Chebyshe v polynomials), splines ( |de Boor 
[T9781 |Dierckx|[T9g5| [Schumak er 2007; Prenter 2008) or alter- 
native schemes that do not involve polynomials. Table [T] recalls 
the main advantages and disadvantages of polynomial fitting. 



3.2. Toward alternative PSF interpolation methods 

Having pointed out some important shortcomings of polynomial 
regression, it seems legitimate to look for alternative classes of 
interpolators. It is however, probably illusory to look for an ideal 
interpolation scheme that can describe equally well any kind 
of PSF variation structure. For instance the patterns of varia- 
tion in a turbulent PSF are very diff'erent from those found in 
a diffraction-limited PSF. It is therefore probably more useful to 
identify which class of interpolators should be preferably used 
for a particular type of PSF pattern. 

It is also key to realize that one does not need to reconstruct 
the entire PSF field: one only has to infer the PSF at specific 
galaxy positions based on its knowledge at sample star positions. 
This implies that the class of interpolation schemes applicable 
to the PSF variation problem is not restricted to surface fitting 
algorithms such as polynomial fitting, but also encompasses in- 
terpolation algorithms acting on scattered data. 

Such data may also be considered as a partial realization 
of some stochastic process. In such case, it becomes possible 
to quantify the uncertainty associated with interpolated values 
and the corresponding interpolation method is referred to as a 
method for spatial prediction. In this article we will neglect this 
distinction and use the generic term "spatial interpolation". 

In fact, there are quite a few interpolation schemes that can 
be applied to model PSF changes. Over the years a large num- 
ber of interpolation methods have been developed in many dis- 
ciplines and with various objectives in mind. Spatial interpola- 
tors are usually classified according to their range (local versus 
global), the amount of smoothing (exact versus approximate) 
and whether they consider the data as a realization of some ran- 
dom process (stochastic versus deterministic). 



Interpolation method 


Scope 


Exactness 


Model 


Polynomial fitting 


global 


approximate 


deterministic 


Basis splines 


global 


approximate^ 


deterministic 


Inverse distance weighting 


local 


exact^ 


deterministic 


Radial basis function 


local 


exact^ 


deterministic 


Ordinary Kriging 


local 


exact^ 


stochastic 



Notes. Can be made exact by disabling smoothing Smoothing 
possible with specific algorithms Some Kriging algorithms are ap- 
proximate 

A global method makes use of all available observations 
in the region of interest (e.g. the image of a whole portion 
of the sky) to derive the estimated value at the target point 
whereas a local method only considers observations found 
within some small neighborhood around the target point. Thus, 
global methods may be preferable to capture the general trend 
in the data, whereas local methods may better capture the local 
or short- range variations and exploit s patial relationships in 
the data ( Burrough & McDonnell| 1998 ). A trend surface is an 
example of global estimator. 

An interpolation methods that produces an estimate that is 
the same as the observed value at a sampled point is called an 
exact method. On the contrary a method is approximate if its 
predicted value at the point diff'ers from its known value: some 
amount of smoothing is involved for avoiding sharp peaks or 
troughs in the resulting fitted surface. 

Lastly, a stochastic (also called geo statistical) interpolator 
incorporates the concept of randomness and yields both an 
estimated value (the deterministic part) and an associated error 
(the stochastic part, e.g. an estimated variance). On the other 
hand, a deterministic method does not provide any assessment 
of the error made on the interpolated value. 

Table |2] contains the list of spatial interpolation methods 
covered in this article along with their classification. 



Nearly all methods of spatial interpolation share the follow- 
ing general spatial prediction formula 



(2) 



where xq is a target point where the value should be estimated, 
the z(x/) are the locations where an observation is available and 
the Xi are the weights assigned to individual observations. 
represents the number of points involved in the estimation (See 
Fig. [2] for an illustration). Each interpolation method has its own 
algorithm for estimating the weights At. All the interpolation 
methods evaluated in this article except splines, follow Eq. ([2]). 

We now review several widely used interpolation schemes 
that can be applied to the PSF interpolation problem: polynomial 
splines, inverse distance weighting (IDW), radial basis functions 
(RBF) and Kriging. In the remaining sections, we test these in- 
terpolation methods using the GREAT 10 Star Challenge simu- 
lated data. 



3.3. Spline interpolation 

A (polynomial) univariate spline or degree p (order p -\- 1) is 
made of a set of polynomial pieces, joined together such that 
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Fig. 2. An illustration of local interpolation between a set of 
neighboring observations Z(x/) at distances di from a target lo- 
cation xq. In this example, a set of weights Ai is assigned to each 
of the Z(x/), as in Eq.[2] 

pieces and their derivatives at junction points (knots) are contin- 
uous up to degree p- \ ( |de B oor| 1 97 8 [ |Dierckx| 1995 [ |Schumaker| 
t20Q7 i Prent er 2008 ). 

When it comes to modeling two-dimensional spatially vary- 
ing PSF attributes across an image, we are more specifically in- 
terested in bivariate polynomial splines. A function s{x,y) de- 
fined on a domain [a, b\ x [c, d\ with respective, strictly increas- 
ing knot sequences /I/, / = 0, 1..., g + I {X^ = a, Xg+i - b) in the 
X direction and fij, j = 0, 1..., h -\- I (juq = c,/Uh+i = d) in the y 
direction is called a bivariate (tensor product) spline function of 
degree ^ > (order k+l)'mx and / > (order / -h 1) in _y if the 
following two conditions are satisfied: 

1. On each subregion D/j = x s{x,y) is 
given by a polynomial of degree k'mx and / in y 

s{x,y) ePk^Pi i = 0, 1, g; j = 0, 1, ...h 

2. The function s(x,y) and all its partial derivatives are contin- 
uous on Dij 



&^h{x,y) 
dx^ dyj 



e C(Dij) i = 0, 1, k-Uj = 0,1, / - 1 



We saw earlier that polynomial fitting suff'ers in particular from 
two serious drawbacks. One of these is that individual obser- 
vations can exert an influence, in unexpected ways, on difl'erent 
parts of the interpolating surface. The other is the tendency of the 
interpolation surface to wiggle without control as soon as one in- 
creases the degree of the polynomial to try to obtain a closer fit. 

Polynomial splines solve these problems in two ways. First, a 
spline is not made of a single "global" polynomial but of a set of 
"local" polynomial pieces. This design confines the influence of 
individual observations within the area covered by the enclosing 
polynomial piece. In most applications, a specific type of spline 
is preferred, the so-called "Basis spline" (B-spline), built from 



as a linear combination of basis polynomial functions called B- 
splines 

g h 

s{x,y) = CijNi^k+i(x)Mjj+i(y) 
i^-kj^-i 

where Ni^k+\{^) and Mjj+i(y) are B-splines defined on the X and 
Id knot sequences respectively. B-splines are popular for their 
computational effic iency, e.g. w ith the algorithms of Cox ( Cox 
1972 ) or de Boor ( de Boor 1972 j). For a formal definition of the 
B-spline basis see e.g. de Boor,(l978| ); |Dierckx| ( [l995) ; Prenter| 
( [2008] ). 

The second issue is solved by the ability to control the 
smoothness of the spline. The example of polynomial fitting 
shows that a good fit to the data is not the one and only goal in 
surface fitting; another, and conflicting, goal is to obtain an es- 
timate that does not display spurious fluctuations. A successful 
interpolation is, actually, a tradeoff between goodness of fit (fi- 
delity to the data) and roughness (or "wiggleness") of fit: a good 
balance between these two criteria will allow the approximation 
to not pick up too much noise (overfilling) while avoiding signal 
loss (underfilling). 

There is an extensive literature on spline interpolation and 
many algorithms and variants have been developed since their 
invention in the 1960s. Still, one can divide spline interpolation 
algorithms into two main families: those based on the so-called 
constructive approach, where the form of the spline function is 
specified in advance and the estimation problem is reduced to the 
determination of a discrete set of parameters; and those that fol- 
low a variational approach, where the approximation function is 
not known a priori, but follows from the solution of a variational 
problem, which can often be interpreted as the minimization of 
potential energy. We outline both approaches below. 



Variational approach of spline interpolation 



The variational approach ( |Wahba|[l990[ [Green & Silverm^ 
1994|) consists in minimizing the functional 



^ r 



(3) 



where the bivariate spline function / is fitted to the z(s/), / = 
0, ...,A/^ set of points in the region D where the a pproximation 
is to be made. It can be shown, e.g. ( Green & Silv erman 1994) 
that the solution is a natural spline, that is, a spline whose second 
and third derivatives are zero at the boundaries, splines obtained 
in such a way as known in the literature as smoothing splines. 
The parameter m represents the order of the derivative of / and 
Of > is a smoothing parameter controlling the tradeoff between 
fidelity to the data and roughness of the spline approximation. 

1. As Of — ^ (no smoothing), the left-hand side least squares 
estimate term dominates the roughness term on the right- 
hand side and the spline function attempts to match every 
single observation point (oscillating as required) 

2. As a — > oo (infinite smoothing), the roughness penalty term 
on the right-hand side becomes paramount and the estimate 
converges to a least squares estimate at the risk of underfil- 
ling the data. 

The most popular variational spline interpolation scheme is that 
based on the thin-plate spline (TPS) ([Pu chon 19 76t [Meinguet 
T979l|Wahba & Wendelberger||1980t |mhba„1990t [Hutchinson 
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1995| ). The TPS interpolating spline is obtained by minimizing 
an energy function of the form (Sj 



S (/, a) = Y, Ikfe) - mf + ^ Jm(g) dSi 

The most common choice of m is 2 with J2 of the form 

where the roughness function g{x, y) is given by 
g(s) = ao + aix + + ^ Xi 0(s - Si) 



(4) 



(5) 



(6) 



being the radial basis function (RBF): 0(x,j) = df ln(di) 

with euclidean distance J/ = ^/(x^^^qy^^^(y^^yi^ . The Ai are 
weighting factors. 



Constructive approach of spline interpolation 

Interpolating splines obtained through such a scheme are of- 
ten refer red to as least squares splines C Dierckx|1980[ Hayes & 
Halliday 1994; Dierckx 1995 ). For such splines, goodness of fit 
is measured through a least squares criterion, as in the variational 
approach, but smoothing is implemented in a different way: in 
the variational solution, the number and positions of knots are 
not varied, and the approximating spline is obtained by mini- 
mizing an energy function. On the other hand, in the constructive 
approach, one still tries to find the smoothest spline that is also 
closest to the observation points. But this is achieved by optimiz- 
ing the number and placement of the knots and finding the cor- 
responding coefficients c in the B- Spline basis. This is measured 
by a so-called smoothing norm G(c). Thus, the approximating 
spline arises as the minimization of 



^(/,c^) = £lk(s,-)-/(s.-)lP + t^G(c) 



(7) 



using the same notation as in ([3]). An example of knot placement 
strategy is to increase the number of knots (i.e. reduce the inter- 
knot distance) in areas where the surface to fit varies faster or 
more abruptly. By the way, we note that minimization is not ob- 
tained by increasing the degree of the spline (which is kept low, 
typically 3). 

Whatever the approach followed for obtaining a suitable in- 
terpolating spline, spline interpolation is essentially global, ap- 
proximate and deterministic, as it involves all available observa- 
tions points, makes use of smoothing and does not provide any 
estimation on interpolation errors. The interpolation can how- 
ever be made exact by setting the smoothing coefficient to zero. 
Also, for smoothing splines (variational approach) a technique 
called generalized cross-validation (GCV) ([Craven & Wahba] 
[T?78l|mhba|| 19901 [Hutchinson & Gessler||1994 ^) allows to au- 
tomatically choose, in expression (|4|), suitable parameters for a 
and m for minimizing cross-validation residuals. Otherwise, one 
can always use standard cross-validation or J ackkn ifing to opti- 
mize the choice of input parameters (see Sect. [4.3[ ). 

The most frequently used splines for interpolation are cubic 
splines, which are made of polynomials pieces of degree at 
most 3 that are twice continuously differentiable. Experience has 
shown that in most applications, using splines of higher degree 



Table 3. Spline interpolation: Pros and cons 



Spline Interpolation 



Pros 



Cons 



Able to capture both broad and detailed features 

The tradeoff between goodness and roughness of fit can be 
adjusted through smoothing 

Overall smoothness may still be too high 

Keep a tendency to oscillate 

No estimation of interpolation errors in most algorithms 

Potentially less efficient to compute than local interpolation 
algorithms 



seldom yields any advantage. As we saw earlier, splines avoid 
the pitfalls of polynomial fitting while being much more flexible, 
which allows them, despite their low degree, to capture finer- 
grained details. The method assumes the existence of measure- 
ment errors in the data and those can be handled by adjusting the 
amount of smoothing. 

On the minus side, cubic or higher degree splines are some- 
times criticized for producing an interpolation that is "too 
smooth". They also keep a tendency to oscillate (although this 
can be controlled unlike with standard polynomials). In addition, 
the final spline estimate is influenced by the number and place- 
ment of knots, which confers some arbitrariness to the method, 
depending on the approach and algorithm used. This can be a 
problem since there is, in general, no built-in mechanism for 
quantifying interpolation errors. Lastly, spline interpolation is a 
global method and performance may suffer on large datasets. A 
summary of the main strengths and weaknesses of spline inter- 
polation is given in Table [3] 

3.4. inverse distance weighting 

Inverse distance weighting (IDW) (| Shepard|1968[ ) is one of the 
oldest spatial interpolation method but also one of the most com- 
monly used. The estimated value z(xo) at a target point x is given 
by Eq. ^ where the weights At are of the form: 



;/E: 



dHxo,Xt) I ^ dKxo,Xt) 



(8) 



In the above expression, d(xQ, X/) is the distance between points 
xo and x/, JS is a. power parameter and N is the number of points 
found in some neighborhood around the target point xq. Scaling 
the weights At so that they sum to unity ensures the estimation is 
unbiased. 

The rationale behind this formula is that data points near the 
target points carry a larger weight than those further away. The 
weighting power j3 determines how fast the weights tend to zero 
as the distance J(xo,x/) increases. That is, as JS is increased, the 
predictions become more similar to the closest observations and 
peaks in the interpolation surface becomes sharper. In this sense, 
the j3 parameter controls the degree of smoothing desired in the 
interpolation. 

Power parameters between 1 and 4 are typically chosen and 
the most popular choice is JS = 2, which gives the inverse- 
distance-squared interpolator. IDW is referred to as "moving av- 
erage" when 13 = and "linear interpolation" when JS = I. 

For a more detailed di scussion o n the effect of the power 
parameter j3, see e. g. Laslett et al. ( 1987[); [Burr ough (1988); 



Brus et al.| ([1996]); [Collins & Bolstad[ ( |1996i r Another way 



to control the smoothness of the interpolation is to vary the 
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Table 4. inverse distance weighting: Pros and cons 



inverse distance weighting 



Pros 



Simple and intuitive 
Fast to compute 



Choice of interpolation parameters empirical 
The interpolation is always exact (no smoothing) 
Cons Sensitivity to outliers and sampling configuration (clustered 
and isolated points) 

No estimation of interpolation errors (deterministic) 



size of the neighborhood: increasing N yields greater smoothing. 

IDW is a local interpolation technique because the estima- 
tion at xo is based solely on observations points located in the 
neighboring region around xq and because the influence of points 
further away decreases rapidly for JS > O.li is also forced to be 
exact by design since the expression for Ai in Eq. ^ reaches the 
indeterminate form 00/ oo when the estimation takes place at the 
point Xo itself. IDW is further labeled as deterministic because 
the estimation algorithm relies purely on geometry (distances) 
and does not provide any estimate on the error made. 

IDW is popular for its simplicity, computational speed and 
ability to work on scattered data. The method also has a num- 
ber of drawbacks. One is that the choice of the JS parameter and 
the neighborhood size and shape are arbitrary, although tech- 
niques such as cross-validation or ja ckknifing can provide hints 
for tuning these parameters (see |4.3| ). Another is that there exists 
no underlying statistical model for measuring uncertainty in the 
predictions. Further, the results of the method method are sen- 
sitive to outliers and influenced by the way observations have 
been sampled. In particular, the presence of clustering can bias 
the estimation since in such cases clustered points and isolated 
points at similar but opposite distances will carry about the same 
weights. A common feature of IDW-generated interpolation sur- 
faces is the presence of spikes or pits around observation points 
since isolated points have a marked influence on the prediction 
in their vicinity. 

The original Shepard algorithm has been enhanced by sev- 
eral authors to a ddress some of the shortcomin gs lis ted above. 
See in particular Renka" f 1 9 8 8J) , Tomczak ([1998) and Lukaszyk" 
( 2004| ). One frequent extension consists in explicitly introducing 
a smoothing factor s into Eq. ([8]), which then becomes 



1 



72 



1 



(9) 



with values of s typically chosen between 1 and 5. Table [4] sum- 
marizes the main pros and cons of inverse distance weighting. 

3.5. Interpolation with radial basis functions 

We just described IDW, a simple form of interpolation on scat- 
tered data where the weighting power ascribed to a set neigh- 
boring point Xi from some point x only depends on an inverse 
squared distance function. 

We now describe a similar, but more versatile form of in- 
terpolation where the distance function is more general and ex- 
pressed in terms of a radial basis function (RBF) (Buhmann 
[20031 [Press et al.|2QQ7l ). A RBF function, or kernel is a reaT 
valued function where the value evaluated at some point xq only 



depends on the radial distance between xq and a set of points x/, 
so that 0(xo - X/) = 0(||xo - x/||). The norm usually represents the 
Euclidean distance but other types of distance functions are also 
possible. 

The idea behind RBF interpolation is to consider that the in- 
fluence of each observation on its surrounding is the same in all 
direction and well described by a RBF kernel. The interpolated 
value at a point xq is a weighted linear combination of RBF eval- 
uated on points located within a given neighborhood of size 
according to the expression 



z(xo) = ^^/z(xO = ^^/(^(||(xo- 



(10) 



/=1 



/=1 



The weights are determined by imposing that the interpolation 
be exact at all neighboring points x/, which entails the resolution 
of a linear system of N equations with N unknown weighting 
factors Ai. In some cases, it is necessary to add a low-degree 
polynomial Pk(^) of degree k to account for a trend in z(x) and 
ensure positive-definiteness of the solution. Expression (fTOl) is 
then transformed into 



z(Xo) = Pk(Xo) + Yj ^(11(^0 - X/l 



(11) 



Sometimes, an interpolation scheme based on a normalized RBF 
(NRBF) of the form 



z(xo) = Yj ^(11(^0 - x,-||)/£ 0(||(xo - x,-|| 



(12) 



is preferred to ( 10 ), although no significant evidence for superior 
performance has been found. 

The actual behavior and accuracy of RBF interpolation 
closely depends on how well the kernel matches the spatial 
distribution of the data. The most frequently used RBF kernels 
are listed in Table [5] where r = ||x - X/|| and the quantity e is the 
so-called shape parameter. The required conditions for to be 
a suitable RBF kernel have been given by |Micchelli| (| 1986) but 
the choice of the most adequate kernel for a problem at hand is 
often empirical. 

The shape parameter e contained in the Multiquadric, Inverse 
Multiquadric and Gaussian kernels influences the shape of the 
kernel function and controls the tradeoff between fitting accu- 
racy and numerical stability. A small shape parameter produces 
the most accurate results, but is always associated with a poorly 
conditioned interp olation matrix. Despite the research work of 
e.g. Hardy ( 1990), lFbIey|(fl 994) and Rippa (1999), finding the 
most suitable shape parameter is often a matter of trial and error. 
A rule of thumb is to set e to approximately the mean distance 
to the nearest neighbor. 

RBF interpolation based on the Multiquadric (MQ) kernel 
^Jl + ( er)^ is the most common. It was first introduced by ^Hardy] 
( 1971 ) as a "superpositioning of quadric surfaces" for solving a 
problem in cartography. In its review of interpolation methods 
on scattered data, Franl^ (1982 ) highlighted the good perfor- 
mance of the MQ kernel, which has, since then proven highly 
successful in many disciplines (| Hardy|19"90 l. 

RBF interpolation is fundamentally a local, exact and deter- 
ministic method. There are, however, algorithms that allow to 
introduce smoothing to better handle noise and measurement er- 
rors in the data. The method can prove highly accurate but this 
really depends on the aflftnity between the data and the kernel 
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Table 5. Most popular RBF kernels 



Semivariance 



RBF kernel 0(r) 



Expression 



Multiquadric 

Inverse Multiquadric 

Gaussian 

Thin-Plate 

Linear 

Cubic 



^j\+{erf 
l/[l+(6r)2] 
exp[-{er)^} 
ln{r) 

r 

^-3 



Table 6. radial basis functions for interpolation: Pros and cons 



radial basis functions 



Pros 



Cons 



Flexibility, thanks to various choice of kernel functions 

Relatively fast (local method), but computational speed de- 
pends on the kernel function 

Choice of kernel functions and interpolation parameters 
empirical 

The interpolation is always exact (no smoothing) 

Sensitivity to outliers and sampling configuration (clustered 
and isolated points) 

No estimation of interpolation errors (deterministic) 



function used. Also, because predictions are exact, RBF func- 
tions can be locally sensitive to outliers. As for other determinis- 
tic methods like splines or IDW, the optimal set of parameters are 
most ofte n determined by cross-validation or Jackknifing (see 
Sect. |4.3| ). Table |6] recapitulates the favorable and less favorable 
aspects of interpolation based on radial basis functions. 



3.6. Kriging 

Kriging is a spatial prediction technique initially created in the 
early 1950's by mining engineer Daniel G. Krige ( Krige 1 951| ) 
with the intent of improving ore reserve estimation in South 
Africa. But it was essentially the mathematician and geologist 
Georges Matheron who put Krige 's work a firm theor etical basis 
and developed most of the modern Kriging formalism ( Matheron 
[19621 [1963). 

Following Matheron 's work, the method has spread from 
mining to disciplines such as hydrology, meteorology or 
medicine, which triggered the creation of several Kriging 
variants. It is thus more accurate to refer to Kriging as a fam- 
ily of spatial prediction techniques instead of a single method. 
It is also essential to understand that Kriging constitutes a gen- 
eral method of interpolation that is in principle applicable to any 
discipline, such as astronomy. 

The following textbooks provide a good introduction to 
the su bj ect: Journel & Huijbregt s| (|1978|); |Isaaks & Srivastava 



1989); 



Cressie|(|199r ); Deutsch & Joumel| (|1997|); [Goovaerts 



T997l); IChiles & Delfiner|(|1999|);|Wackernagel| ( [2003] ); [Waller 
& Gotwa y (2004)); [Webster & Qliver |(|20Q7f; 

Like most of the local interpolation methods described so 
far in this article, Kriging makes use of the weighted sum ^ 
to estimate the value at a given location based on nearby ob- 
servations. But instead of computing weights based on geo- 
metrical distances only, Kriging also takes into account the 
spatial correlation existing in the data. It does so by treating ob- 
served values z(x) as random variables Z(x) varying according 




V {~) = C{0) 



Separation distance h 




Range 



Fig. 3. (a) typical variogram y(h) and its equivalent covariance 
function C(h): if the data has some sort of spatial autocorrela- 
tion, nearby (small h) Z(x) observed values will be more similar 
than more distant Z(x) values (larger /z); (b) as the separation 
distance h grows, the quantity Z(x + h) -Z(x) in expression (16) 
will tend to increase on average, but less and less as the influ- 
ence of Z(h) on Z(x + h) weakens; at some threshold distance 
h, called the range, the increase in variance becomes negligible 
and the asymptotical variance value is known as the sill 



to a spatial random process [^ In fact, Kriging assumes the un- 
derlying process has a form of second-order stationarity called 
intrinsic stationarity. Second-order stationarity is traditionally 
defined as follows: 

1. The mathematical expectation E(Z(x)) exists and does not 
depend on x 



^[Z(x)]=m, Vx 



(13) 



2. For each pair of random variable {z(x),Z(x + h)}, the co- 
variance exists and only depends on the separation vector 
h = Xj - X/, 



C(h) = e[[z(x + h) - m] [z(x) - m]}, V x 



(14) 



Kriging 's intrinsic stationarity (Matheron 1963 1965| ) is a 
slightly weaker form of second-order stationarity where the dif- 
ference Z(x + h) - Z(x) is treated as the stationary variable in- 
stead of Z(x): 

1. e[z(x + h) - Z(x)] =0, Vx (15) 

2. Var[z(x + h) - Z(x)] = e[[z(x + h) - Z(x)f] = 2y(h) (16) 

The function y(h) is called semivariance and its graph semivar- 
iogram or simply variogram. 

One reason for preferring intrinsic stationarity over sec- 
ondary stationarity is that semivariance remains valid under a 
wider range of circumstances. When covariance exists, both sta- 
tionarities are related through 



r(h) = C(0) - C(h), C(0) = y(3r[z(x)] 



(17) 



Fig. [3] shows a typical variogram along with its equivalent 
covariance function. 

Over the years about a dozen Kriging variants have been de- 
veloped. We will concentrate here on ordinary Kriging (OK), 
which is, by far, the most widely used. The description of other 
forms of Kriging can be found in the literature given at the be- 
ginning of this section. 



also called random function or stochastic process 
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ordinary Kriging is a local, exact and stochastic method. The 
set of Z(x) is assumed to be an intrinsically stationary random 
process of the form 



Z(x) = m + 6 (x) 



(18) 



The quantity e(x) is a random component drawn from a prob- 
ability distribution with mean zero and variogram 7(h) given 
by 1JT6]). The mean m = E[Z(x)] is assumed constant because 
of ( |15| ), but remains unknown. The ordinary Kriging predictor is 
given by the weighted sum 



Z(xo) = J^.-ZCx,-) 



(19) 



where the weights At are obtained by minimizing the so-called 
Kriging variance 



Var[z(xo) - Z(xo)] = e{[z(xo) - Z(xo)] } (20) 
subject to the unbiaseness condition 

e[z(xo) - Z(xo)] = = ^ E[z(Xi)] - m (21) 

The resulting system of A/^ -h 1 equations mN-\-l unknowns Ai is 
known as the ordinary Kriging equations. It is often expressed 
in matrix form as AA = b with 



A = 



r(xi,xi) r(xi,x2) 

7(X2,Xi) y(X2,X2) 



y(xN,xi) y(xN,X2) 

1 1 



r(xi,XA^) 1 

7(X2,Xa^) 1 



r(XA^,XA^) 1 

1 



(22) 



a" = [Ai A2 "'An fi] = ^ 

= [ r(xi, xo) r(x2, xo) • • • y(XN, Xq) 1 ] 

The weights Ai, along with the Lagrange multiplier yu, are ob- 
tained by inversing the A matrix 



A = A-^b 



(23) 



The main interpolation steps with ordinary Kriging can now 
be articulated: 

1. Construct an experimental variogram by computing the 
experimental semivariance y(h) for a range of separation 
distances ||h||. 

2. Fit the experimental variogram against an authorized 
variogram model. The mathematical expressions for the 
most common authorized theoretical variogram models are 
summarized in Table |7] After completion of this step, the 
y(Xi,Xj) value at any separation vector h = Xj - X/ can be 



calculated and used to compute the A matrix ( 22 ) 



3. Calculate interpolated values: derive the kriging weights At 
for each point of interest xq by solving Eq. ( 23 ) and obtain 
the kriging estimate at xq by substituting in (|19|). 



Table 7. Authorized Kriging theoretical variogram models. The 
models in this table correspond to purely isotropic Kriging. More 
elaborate formulas exist for correcting geometrical anisotropy 
through the rescaling or rotation of coordinate axes along the 
direction of major spatial continuity. 



Model 


Expression 








Pure Nugget 


y(h) = 
y(h) = Co 


Co 


>0 


h>0 


Spherical 


y(h) = Co + 
y(h) = Co + 


c 


- m) 


h<a 
h > a 


Exponential 


y(h) = Co + 


c[l - 






Gaussian 


y(h) = Co + 


c[l - 







Power 



y{h) = co + bhP 



b>0,0<p<2 



Notes. In the above expressions, co = lim/^^o7(^) is the so-called 
nugget constant that represents measurement errors or indicates a spa- 
tially discontinuous process. The quantities (co + c) and a respectively 
represent the variogram sill and range. The Pure Nugget model corre- 
sponds to absence of spatial correlation. 

Table 8. Kriging interpolation: Pros and cons 



Kriging 



Pros 



Predictions based on a spatial statistical analysis of the data 

Best linear unbiased estimator (BLUE) 

Many forms of Kriging available, applicable to various data 
configurations 

Automatically accounts for clustering and screening eff'ects; 
remains efficient in conditions of sparse data 
Can take into account variation bias toward specific direc- 
tions (anisotropy) 

Able to quantify interpolation errors (Kriging variance) 



Cons 



Overall complexity 

Requires care when modeling spatial correlation structures 

Assumptions of intrinsic stationarity may not be valid (drift) 
and be handled though an appropriate Kriging variant 

Most Kriging variants are exact (no smoothing) 

Kriging is more computationally intensive than other local 
methods 



Most of the strengths of Kriging interpolation stem from the 
use of semivariance instead of pure geometrical distances. This 
feature allows Kriging to remain efficient in condition of sparse 
data and to be less affected by clustering and screening effects 
than other methods. 

In addition, as a true stochastic method, Kriging interpola- 
tion provides a way of directly quantifying the uncertainty in 
its predictions in the form of the Kriging variance specified in 
Eq. ([20]). 



The sophistication of Kriging, on the other hand, may also be 
considered as one of its disadvantages. A thorough preliminary 
analysis of the data is required or at least strongly recommended 
prior to applying the technique (e.g. (Tukey, 1977, ). This can prove 
complex and time consuming. 

One should also bear in mind that Kriging is more computa- 
tionally intensive than the other local interpolation methods de- 
scribed in this article. The strong and weaker points of Kriging 
interpolation are highlighted in Table [5] 
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SET 01 




SET 26 



Challenge 
contains 26 Sets 



Fitting 



Input PS F field 



- Fit against elliptical Moffat PSF model 

- Validation: residual, plots, simulated data.. 



Image 01 



... c 



Image 50 



Each Set contains 50 images, 
with 500-2000 Stars each 



Fig. 4. Star Challenge simulated data 

4. Applying spatial interpolation schemes on the 
GREAT10 Star Challenge data 

In 201 1, we participated in the GREAT 10 Star Challenge compe- 
tition ( K itching et al.|201 H|2012b| ), which allowed us to evaluate 
the performance of the interpolation schemes described above: 
those based on splines, inverse distance weighting (IDW), radial 
basis functions (RBF) and ordinary Kriging. To our knowledge, 
the only reference to a similar work in the field of astronomy is 
that of B erge et al.| ( |2012| ). 

The GREATIO Star Challenge ran from December 2010 to 
September 2011 as an open, blind competition. As illustrated in 
Fig. |4] the data consisted in 26 datasets of 50 PSF fields, each 
field containing between 500 and 2000 simulated star images 
and featuring specific patterns of variation. The stars images 
were supplied as non-overlapping, randomly-scattered 48 x 48 
pixels postage stamps, altered by Gaussian noise. 

After completion of the challenge, it was revealed the stars 
had either a Moff^at (Moff'at 1969 ) or pseudo-Airy ( |Bom & Woffj 



1999 



Kuijken,2008,) pro file, with a telescope component model 



arvis et al.| ( |2008| ). Depending on the sets, specific addi- 



from 

tional eff'ects, such as Kolmogorov turbulence, were also incor- 
porated. 

The challenge itself was to predict the PSF at 1000 requested 
positions in each of the 1300 PSF fields (see Fig.[T]). 

4.1. Which model for the PSF? 

The first important step to make was to choose an appropriate 
model for the PSF. Indeed, before selecting a particular PSF in- 
terpolator, one has to decide on which type of data that interpo- 
lator will operate. 

Essentially three PSF modeling approaches have been ex- 
plored in the literature: 

1 . PSF as a combination of basis functions 

2. PSF left in pixel form 

3. PSF expressed in functional form 

To help choosing the right model for the data at hand, useful 
guidance is provided by the notions of c omplexity and sparsity, 
recently put forward by (Paulin-Henrik sson et al.||20 08, 2009). 
The complexity of a model is characterized by the amount of 
information required to represent the underlying PSF image, 
which can be expressed as the number of degrees of freedom 
(DoF) present in the model. The more sophisticated the model 
the greater the number of its DoF. Sparsity, on the other hand, 
is meant to describe how efl&ciently a model can represent the 







Prediction 






Reconstruction 




- Spatial analysis: neighbors, separation distances... 

- Spatial prediction: splines, IDW, RBF, Kriging... 

- Validation: cross-validation, Jackknifing... 

- Reconstruction from predicted parameters 

- Validation: visual inspection, quadrupole moments 



Reconstructed PSF field 

at requested non-star positions 



Fig. 5. The three- stage PSF prediction pipeline we used to com- 
pete in the Star Challenge. Elliptical Moffat profiles are fitted 
to the stars contained in the input Star Challenge PSF field; 
the model resulting parameters are then individually interpolated 
across the field at requested locations, using one of our PSF spa- 
tial interpolator. Lastly, the star images are reconstructed from 
the set of Moff'at parameters predicted in the previous stage. 



actual PSF with a limited number of DoF, that is, with a simple 
model. 

The simulated star images looked relatively simple and we 
decided that the right level of sparsity could be achieved with 
PSF in functional form (the third option). We then assumed that 
the most likely PSF profile used to create the stars was either 
Airy or Moff'at. We opted for an elliptically symmetric Moff'at 
function for its simplicity and because the stars did not show 
significant diff'raction spikes. Each star was thus assumed to have 
a light intensity distribution of the form: 



/(f) = /o[i+(^)T''- i'f^'. 



x' 


Xq 




coscf 


> sin0 




X 


Xq 


y' 






- sm(f 


> COS0 




y 


-Jc 



In the above expression, /q is the flux intensity at ^ = 0, ^ being 
the radius distance from the centroid (x^, jc) of the PSF to a spa- 
tial coordinate 



(24) 



obtained after counterclockwise rotation through an angle with 
respect to the (0, x) axis. The quantity a = FWHM [2^/^ - 1]"^^^ 
is the Mofl'at scale factor expressed in terms of the full width at 
half maximum (FWHM) of the PSF and the Mofl'at shape pa- 
rameter /3. Lastly, q is the ratio of the semi-minor axis b to the 
semi-major axis a of the isophote ellipse, given by ^ = b/a 
(1 - |e|)/(l + |e|), with 
€2 = |e| sin 20. 



ei = |e| cos 20 and 



4.2. Our PSF prediction pipeline 

The three- stage PSF prediction pipeline we used in the Star 
Challenge is sketched in Fig. [5] The purpose of the Fitting stage 
is to produce a catalog of estimated full width at half maximum 
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(FWHM) and ellipticity values of the stars found at known spa- 
tial positions within the input Star Challenge PSF image. 

In the Prediction stage, that catalog is processed by an in- 
terpolation algorithm and a catalog is produced with estimated 
FWHM and ellipticities at new positions in the same image. 
Competitors were r equired to submit th eir results in the form of 
FITS Cube images ( [Kitching et al.[ 2011 ). In the Reconstruction 
stage, each star in a PSF field is thus reconstructed using that for- 
mat from the interpolated quantities predicted in the Prediction 
stage. A more detailed description of the pipeline is given in 
Appendix [A| 



Table 9. Common diagnostic statistics for use with cross- 
validation and Jackknifing. 



Statistics Expression 



Mean error 


ME = i 

n 


[Z(x,)-Z(X[,])J 


Mean squared error 


MSE = 




Z(x,) - 


Z(x„])]' 


Mean absolute error 


MAE = 


1 

n 


Z(x,) - 


Z(X[,])| 


Mean squared deviation 


MSDR -- 




1 {[Z(x,) 


-Z(X[,])f/crf,) 


ratio 









4.3. Cross-validation andJackknifing 

The Star Challenge was a blind competition. The true answers 
being unknown, it was essential to find ways to evaluate how 
far the actual results were from the truth. To assess the fitting 
accuracy in the first stage of the pipeline we could rely some- 
what on the analysis of the residuals between observed and fit- 
ted star images. But when it came to evaluate prediction results, 
we had no such residuals to help us appraise the accuracy of the 
interpolation algorithm: we could only rely on the fitted observa- 
tions Z(x/). The use of cross-validation and Jackknifing provided 
a satisfactory solution to this problem. 



Cross-validation 

Cross-validation (CV) is a resampling technique frequently used 
in the fields of machine learning and data minin g to evaluate 
and compare the performance of predicti ve models ([S tonej 1 914\ 
|Geisser|1975l|Isaaks & Srivas tava|1989l|Browne|200Q| ). 

In the context of the Star Challenge, we used CV to both 
evaluate the performance of an interpolation method and tune 
the free parameters of the underlying interpolation models. 

As explained earlier, the deterministic interpolation methods 
(IDW, RBF, splines) we tested in the competition did not pro- 
vide any quantification of residual errors. The first three diag- 
nostic statistics mentioned in Table [9] provided a good indica- 
tion of the level of accuracy reached. This technique was use- 
ful for Kriging as well because we could directly compare the 
mean error (ME) and mean squared error (MSE) provided by 
CV: Kriging being an unbiased estimator, we expected ME to be 
nearly zero, the MSE to be close to the Kriging variance pro- 
vided by Eq. ( [2Q| and the mean squared deviation ratio (MSDR) 
to be around unity. 

CV also proved useful for tuning the free parameters of 
the model s beh ind the interpolation schemes, as mentioned in 
Appendix |A.2[ For instance, for RBF interpolation, we could 
rapidly try and discard the cubic, quintic, Gaussian and inverse 
multiquadric kernel functions. Another example was the ability 
to find the best search neighborhood size for local distance-based 
interpolation methods. 

Jackknifing 

The Jackknifing resampling technique was first proposed by 



Quenouille| ([1956 ) and further developed by |Tukey 



classical revi e w on that subject is that of Miller 
( T9821 ); [Efron & Gong] ( [T983] ); |Davis| 



Efron 



( [T958| ). A 



JT914\ See also 



1987); Tomczak 



(1998) for more general discussions on the use of CV in con- 
nection to Jackknifing. 

To Jackknife a Star Challenge PSF field image, we would 
typically split the set of input coordinates into two equally-sized 



Table 10. Final results obtained by the B-SPLINE, IDW, Kriging 
and RBF methods in the Star Challenge, sorted by decreasing 
P-f actors. The B- splines method obtained the highest P-f actor 
of the competition while the remaining four achieved the next 
highest scores. 



Rank 


PSF Interpolation method 


P 


cr2 

^ sys 


1 


Basis Spline (B-splines) 


13.29 


7.53 X 10-5 


2 


inverse distance weighting (IDW) 


13.17 


7.59 X 10-5 


3 


radial basis function (RBF) 


12.72 


7.86 X 10-5 


4 


radial basis function (RBF thin) 


12.61 


7.93 X 10-5 


5 


ordinary Kriging (OK) 


7.23 


1.38 X 10-4 



Table 11. Average values of the performance metrics E and cr 
(see Sect. [5^ over all sets, obtained by the B-SPLINE, IDW, 
Kriging and RBF methods in the Star Challenge. 



Method 


E(e) 


o-(e) 




E(R^) 


o-(R^) 


B-splines 


2.03 X 10-2 


8.57 X 10- 


-4 


1.90 X 10-1 


8.25 X 10-4 


IDW 


2.04 X 10-2 


8.69 X 10- 


-4 


1.92 X 10-1 


1.07 X 10-3 


RBF 


2.26 X 10-2 


9.73 X 10- 


-4 


1.98 X 10-1 


1.39 X 10-3 


Kriging 


3.17 X 10-2 


1.26 X 10- 


-3 


2.22 X 10-1 


2.18 X 10-3 



sets of star locations, i.e. 1000 randomly-selected star centroid 
positions from a set of 2000, one used for input and one used for 
prediction. We would then interpolate the PSF of the prediction 
set based on the PSF of the input set. 



5. Analyzing our GREAT10 Star Challenge results 

5.1. Results on the Star Challenge data 

The results obtained in the Star Challenge by the B-splines, 
IDW, Kriging, RBF and RBF-thin PSF interpolation schemes are 
shown in Table [TOl 

The B-splines method won the Star Challenge while the re- 
maining four achieved the next highest scores of the competition. 

The quantity P refers to t he so-called P-factor, specified in 
Kitching et al.|p011| |2012b| ). That P-f actor is defined so as to 
measure the average variance over all images between the esti- 
mated and true values of two key PSF attributes: its size R and 
ellipticity modulus e = |e|, estimated using second brightness 
moments computed over the reconstructed PSF images. Since 
the GREAT 10 simulated star images have either Moff'at or Airy 
profiles, R is actually an estimator of the FWHM of the stars. 
The cr^^y^ quantity is related to the P-factor by cr^^^ = 10~^/P 



and represents a total residual variance in the measurement of 
the PSF. It approximates the corresponding metric specified 
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Table 12. Performance metrics used in this article. The angle 
brackets ( and ) denote averages and "stdeV the standard devi- 
ation. These statistics are calculated over the = 1000 stars in 
each of the 50 images of each set. 



PSF attribute 


Metrics 


PSF Ellipticity 


E{e) = ^{(ecaic - etruef) 1 2 




a{e) = stdev {Ccaic - etme) / ^ / ^^N 


PSF Size 


E(R^)= ^{{Rll,-Rluef)l{Rlue) 




cr{R') = stdev {R^,^ - RIJ / {RlJ / VA^ 



in|Amara & Refregier] ( |2008t ; [Paulin-Henriksson et al.| ( [2QQ8 
[2009] ): 



5.2. Performance metrics 

In this article, we do not rely on the P-factor as a metric for 
assessing the performance of our methods, for the following rea- 
sons. Firstly, the P-factor is specific to the Star Challenge and is 
not mentioned anywhere else in the literature on PSF interpo- 
lation. Secondly, we are really interested in knowing the indi- 
vidual accuracy of ellipticity and size but P only appraises the 
combined performance of these quantities. 

To assess the performance of an interpolator, we calculate 
instead the root mean squared error (RMSE) and standard error 
on the mean (SEM) of the residuals bet ween true and calculated 
values of PSF ellipticity and size. As in |Paulin-Henriksson et al. 
2008| ); |Kitching et al.|f2012b| ), we adopt the ellipticity modulus 



J77 



e = (ef-\- and size squared as respective measures of 
ellipticity and size, and define the corresponding residuals as 

S(e) = ecalc - etrue. = Rlalc " ^lue 

As regards PSF ellipticity, we adopt as performance metrics 
E{e) = RMSE(S(e)/2\ cr(e) = SEM(S(e)/2) 



while for PSF size, we evaluate 



E(R^) 



RMSE(S(R^))/{Rl^ 



CT(R^) 



SEM(S(R^))/{Rl^^} 



where the angle brackets ( and ) denote averaging. The factor 2 
in the expressions of E(e) and cr(e) arises because ellipticity has 
two components. We calculate these metrics over the N = 1000 
stars in each of the 50 images of each set. 

The quantity E provides a measure of the global accuracy of 
the interpolator (bias and precision combined) while cr provides 
insights into the variance of the residuals. The exact expressions 
for these performance metrics are given in Table [12 



5.3. Analysis of the Star Challenge results 

The performance metrics of B -splines, IDW, RBF and Kriging 
are given in Table [TT] The results of RBF and RBF-thin being 
very close, we no longer distinguish these two interpolators in 
the reminder of this paper and only mention them collectively as 
RBF. 

Since a detailed analysis of the Star Challenge results of 
B -splines, IDW, RBF and Kriging as already been performed in 
[Kitching et al.| ( |2012b| ), a similar analysis would be redundant 
here. We do have, however, a couple of observations to make. 



based on the metrics in Tables [TOl and [TT] 

We observe that the global cr^^^ variance of the most suc- 
cessful in terpolation method is of the order of 10""^. As demon- 
strated in Amara & Refregier" (2008); Paulin-He nriksson et al. 
( 2008) and confirmed by Kitching et al. ( 2009), future large sur- 
veys will need to constrain the total variance in the systematic 
errors to cr^^^ < 10"^, which corresponds to E(e) < 10"^ and 
E(R^) < 10"^. The Star Challenge results thus tend to suggest 
that a ~ 10 improvement in E(e) and a ~ 100 improvement in 
E(R^) are still required for achieving that goal. 

Secondly, since we have been using a three-stage pipeline 
as described in Sect. 4.2 each stage, fitting, interpolation and 
reconstruction, can potentially contribute to the final error in 
size and ellipticity. Investigations following the publication of 
the true size and ellipticity values after the end of the Star 
Challenge, have led us to conclude fitting was actually the main 
performance limiting factor, not the interpolation or reconstruc- 
tion process. 

Also, the comparatively lower performance of Kriging is 
not related to the interpolation algorithm itself, but is actually 
due to an inadequate fitting setup, that was subsequently fixed 
for B-splines, IDW and RBF submissions. 

As the main goal of this article is to assess the respective 
merits of the interpolation methods, we wish to eliminate all in- 
accuracies related to fitting. To achieve this, we use instead of 
our fitted ellipticity and FWHM estimates at known positions, 
the true input values, kindly supplied to us by the GREAT 10 
team. We interpolate these true input values at the expected tar- 
get positions and then measure the error made by the interpo- 
lators. We present and analyze the corresponding results in the 
next section. 



6. Comparing PSF spatial interpolation schemes 

The results presented in this section are based on true FWHM 
and ellipticity values at known positions in the Star Challenge 
PSF images. We are thus confident that error statistics we ob- 
tained truly reflect the performance of the PSF interpolation 
methods and are not influenced in any way by inaccuracies due 
to the fitting of our PSF model or to the image reconstruction 
processes. 

We compare below the respective performance of five PSF 
spatial interpolation schemes: 

- The four interpolation schemes introduced in Sect. |3] that 
competed in the Star Challenge: B-splines, IDW, RBF and 
ordinary Kriging. 

- An additional scheme, labeled Polyfit, which corresponds to 
a least- squares bivariate polynomial fit of the PSF, similar to 



that typically used in weak lensing studies (see Sect. 3.1 



The metric values reflecting the average accuracy E and error 
on the mean cr for these five interpolation schemes are given in 
TableflSl 



6.1. Overall performance 

The E and cr metrics on ellipticity and size after interpolation 
with all five methods are given in Table [13] These results lead to 
the following observations: 
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Table 13. Average values of the performance metrics E and cr 



(see Sect. 5.2) over all sets, based on the true input ellipticities 



and sizes. 


Method 


E{e) 




a{e) 












RBF 


L73x 10" 


-2 


7.18 X 10- 


-4 


4.58 X 10- 


-3 


1.44 X 10- 


-4 


IDW 


1.78X 10- 


-2 


7.24 X 10- 


-4 


9.25 X 10- 


-3 


2.91 X 10- 


-4 


Kriging 


1.82 X 10- 


-2 


7.09 X 10- 


-4 


6.47 X 10- 


-3 


2.03 X 10- 


-4 


Polyfit 


2.29 X 10- 


-2 


7.52 X 10- 


-4 


5.16 X 10- 


-3 


1.62 X 10- 


-4 


B- splines 


2.33 X 10- 


-2 


7.39 X 10- 


-4 


6.45 X 10- 


-3 


2.04 X 10- 


-4 



If we compare Tables [13] and [12] we observe a ~ 100-fold 
decrease of E(R^) for all interpolators. This confirms that the 
fitting of PSF sizes was the main limitation that prevented 
us from reaching better results in the Star Challenge. In 
comparison, the fitting of ellipticities was quite good. 

if we now concentrate on Table [13] we find that the RBF in- 
terpolation scheme based on the use of radial basis functions, 
has the highest accuracy and smallest error of the mean, both 
on size and ellipticity. We also observe that E(e) ~ 10"^ 
whereas E(R^) ~ 10"^. This is because these statistics are av- 
erag es oy er 26 image sets with diff'erent characteristics (see 
Sect. 



10-2 and ~ 10" 



. 6.2 ). In reality, E(e) varies between 
whereas E(R^) ~ 10"^ regardless of the sets 



If we consider E(e) in particular, two groups emerge. 
The first one contains RBF, IDW and Kriging, with 
E(e) < 1.8 X IQ-^. The interpolators of the second group, 
B-splines and Polyfit with E(e) > 2.3 x IQ-^. We will see 
below that this is essentially due to the better accuracy of 
local interpolators on turbulent sets as regards ellipticity. If 
we focus on E(R^), the distinction between local and global 
interpolation schemes disappears. RBF and Polyfit stand out 
from the others with E(R^) ^ 5 x 10"^. We also note that 
the accuracy of IDW on size is worse by several order of 
magnitude. 

The errors on the mean cr(e) and cr(R^) are of the order of 
lO"'* for all five schemes. As was observed for E(e), we 
find that the local interpolators RBF, IDW and Kriging reach 
better cr(e) values compared to global ones, B-splines and 
Polyfit. As for cr(R^), the best values are reached by RBF 
and Polyfit, similarly to what was found for E(R^). 



6.2. Influence of PSF features simulated in the images 



As exp lained in the Star Challenge result paper ( [Kitching et al. 
2012bl ), the image sets were designed to simulate typical PSF 
features found in real astronomical images. Each set implements 
a unique combination of characteristics against which a method 
can be evaluated. All 50 images within a set share the same broad 
features, but diff'er in the way star positions, sizes and ellipticities 
are spatially distributed across the field. 

The various PSF features tracked in the images are outlined 
below: 

- PSF model: the fiducial PSF model includes a static and a 
dynamic component. The static component is based on a 
pseudo-A iry (Bo rn & Wolf||1999[ |Kupen][2QQ8] ) or Moff'at 
(Moffat 1969J functional form, depending on the set. The 



1000 2000 3000 4000 





1000 2000 



Fig. 6. A Star Challenge non-turbulent PSF (left) compared with 
a turbulent PSF (right). Each "whisker" represents the ampli- 
tude |e| of the ellipticity of stars. The largest whisker in the left 
hand side image corresponds to an ellipticity of 0.16. The right 
hand side image has a maximum ellipticity of 0.37. The ellip- 
ticity plots have respectively been made from the first PSF field 
image of sets 8 and 14. 



dynamic component made the ellipticity and size of in- 
dividual stars vary spatially across the image of the PSF field. 

Star size: the images from most of the sets share the same 
"fiducial" 3-pixel FWHM, except sets 6, 14, 26 and sets 7, 
15 whose images have respectively a FWHM of 1.5 and 6 
pixels. 

Masking: sets 2, 10, 22 have a 4-fold symmetric mask 
denoted as "-h" and sets 3, 11, 23 have a 6-fold mask sym- 
bolized by a "^". Images from all other sets are unmasked. 

Number of stars: the majority of images contain 1000 stars. 
Sets 4, 12, 24 are denser, with 2000 stars, whereas sets 5, 
13, 25 are sparser, with only 1500 stars. 

Kolmogorov turbulence (KM): an attempt was made on 
sets 9 to 15, 17, 19 and 21 to simulate the effect of atmo- 
spheric turbulence by including a Kolmogorov spectrum 
in PSF ellipt icity. See |Heymans et al.| pOll] ); [Kitching 
et al. (2012b) for the details. Fig. |6] shows side by side a 
non-turbulent and a turbulent PSF. 



- Telescope effect: a deterministic component was included 
in sets 17, 19 and 21 to reproduce effects from the 
telescope optics on the PSF ellipticity and size, essent ially 
primary astigmatism, primary defocus and coma (B om &| 
[mlf|r999| ), based on the model of |Jarvis & Jam|p004i r 

In order to determine how interpolation schemes are affected by 
the aforementioned PSF characteristics, we have computed for 
each of them the performance metrics per individual image sets. 
We have plotted the metrics E{e) and E(R^) in Figs, [t] and [s] We 
analyze the results below. 

- Influence of turbulence The PSF feature that affects 
the interpolation methods the most is the presence of a 
Kolmogorov (KM) turbulence in ellipticity. Fig. [6] illustrates 
how erratic the spatial variation pattern of ellipticity can be- 
come in the presence of KM turbulence. It is clear that a 
prediction algorithm faces a much more challenging task 
on turbulent images than on images with more regular PSF 
patterns. To highlight this, we have averaged in Tables [14 
and [15] the metrics E and cr separately over turbulent and 
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Fig. 7. Accuracy per set for the RBF and IDW interpolation methods. Sets with pseudo-Airy and Moffat are respectively colored 
in different shades of blue and orange, as specified in the legend at the bottom left of the Figure. The various patterns contained in 
the left hand- side legend indicate the types of physical PSF features simulated in the images. The values on the bars correspond to 
logi^{\IE{e)) and logio(l/E(R^)) depending on the quantity plotted, so the taller the bar the greater the corresponding accuracy. 



Table 14. Non-turbulent sets: average values of E and cr. 



Table 15. Turbulent sets: average values of E and cr. 



Method 


E(e) 


o-(e) 


E(R^) 




Method 


E(e) 


o-(e) 


E(R^) 


o-(R^) 


RBF 


8.26 X 10-4 


3.60 X 10-5 


4.59 X 10-3 


1.45 X 10-4 


RBF 


4.36 X 10-2 


1.81 X 10-3 


4.57 X 10-3 


1.44 X 10-4 


IDW 


1.28 X 10-3 


5.67 X 10-5 


9.37 X 10-3 


2.95 X 10-4 


IDW 


4.42 X 10-2 


1.79 X 10-3 


9.05 X 10-3 


2.85 X 10-4 


Kriging 


7.06 X 10-4 


3.16 X 10-5 


3.57 X 10-3 


1.13 X 10-4 


Kriging 


4.61 X 10-2 


1.79 X 10-3 


1.11 X 10-2 


3.49 X 10-4 


Polyfit 


8.37 X 10-4 


3.73 X 10-5 


5.23 X 10-3 


1.64 X 10-4 


Polyfit 


5.82 X 10-2 


1.89 X 10-3 


5.04 X 10-3 


1.58 X 10-4 


B -splines 


6.28 X 10-4 


2.80 X 10-5 


6.53 X 10-3 


2.06 X 10-4 


B- splines 


5.97 X 10-2 


1.88 X 10-3 


6.31 X 10-3 


1.99 X 10-4 



non-turbulent sets. Comparing these two tables shows that 
E(e) ~ 10"^ and cr(e) ~ 10"^ on non-turbulent sets, whereas 



E(e) ~ 10 ^ and cr(e) ~ 10 ^ on turbulent sets. This rep- 
resents a ~ 100-fold decrease in accuracy and error on the 
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Fig. 8. Accuracy per set for Kriging, polynomial fitting and B-splines. The legend is the same as that used in Fig]?] The values 
on the bars correspond to log\i^{\IE{e)) and logio(l/E(R^)) depending on the quantity plotted, so the taller the bar the greater the 
corresponding accuracy. 



mean. This eff'ect can also be seen on the plots of E(e) in 
Figs. [T] for and [8] 

We also observe that, on sets without a KM spectrum, 
all interpolators evaluated in this paper typically reach 



cr'ly^ ~ 10"^ already beyond the ~ 10"^ goal of next- 
generation space-based weak lensing surveys. In contrast, 
sets with turbulent PSF do not match that requirement, with 
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Fig. 9. An illustration of how the various interpolation methods studied in this article handled a turbulent PSF, which in this case 
is the first image of set 9. The true ellipticities are plotted on the upper-left comer of the Figure and the remaining plots show the 
predictions of each methods. The largest whisker in the upper-left corner plot corresponds to an ellipticity of 0.38. 



The similarities between E{e) and (j{e) values for RBF, IDW 



and Kriging in Table 15 suggest these methods behave more 
or less the same when confronted with turbulent elliptici- 
ties. To check this, we have have compiled in Fig. |9] the 
true ellipticity pattern of turbulent set 9 along with the ac- 
tual predictions of the same pattern by all five interpolators. 
The same metrics for Polyfit and B-splines show that these 
global methods are even more handicapped by the presence 
of a KM spectrum. 

Turbulence also makes the spatial distribution of the FWHM 
less predictable and the methods are aff'ected to various 
degrees: RBF, IDW, Polyfit and B- Spline are little influenced 
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and 



with similar E(E^) and cr(R^) values in Tables [14| and _ 
on the corresponding plots in Figs. [7] and [5] The only one 
really impacted is Kriging. 

Influ ence of star density Following the discussion of 
Sect. |3.2[ we expect the local interpolation methods to 



be more accurate than global ones on images with higher 
star density but see their performance degrade on sparser 
star fields. Such local interpolators base their predictions 
on observations found in local neighborhoods and should 
therefore be in position to take advantage of any additional 
available. On the other hand, they should suffer compar- 
atively more from insufl&cient sampling when the data is 
too sparse. This is indeed what we observe in the IDW plot 



Fig. [7] but the conclusion is less clear regarding RBF and 
Kriging: these schemes are indeed more accurate on denser 
sets when it comes to estimate the FWHM but the reverse 
is seen concerning ellipticities (plots Figs. [7] and [8]). This is 
mostly noticeable on non-turbulent sets and may be caused 
by some overfilling taking place on denser ellipticity fields. 
This phenomenon does not occur on FWHM possibly be- 
cause the FWHM spatial distribution is generally smoother 
than that of ellipticities in the Star Challenge dataset. 
We also expect the global interpolators B-splines and Polyfit 
to be little affected by difference in star density, since such 
schemes attempt to find a regression surface that takes all 
available data into account but at the same time minimize 
the overall bias through the least squares criterion. Such a 
surface tends to smooth out small-scale variations, mostly 
capturing broad features in the image. The corresponding 
predictions may become less accurate but, on the other 
hand, remain little influenced by sampling differences. This 
is exactly what we find in the plots of Polyfit and B-splines 
Fig. [8] The smoothness of the prediction surfaces of Polyfit 
and B-splines compared to that of local interpolators is 
clearly noticeable in Fig.|9] 

Influence of the PSF model and size, masking and tele- 
scope efl'ects Although some interpolators do better than 
others on a particular PSF models, each individual scheme 
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perform equally well on Moffat and Airy images. This can 
be seen, for example, on fiducial sets 1 and 8 where the 
error statistics on Moff'at or Airy sets are almost identical 
for a given method. The same can be said of the influence 
of FWHM, masking and telescope efl'ects. We also observe 
star size to have a negligible impact on E(e) for all meth- 
ods, but we clearly see that E(R^) significantly increases 
(resp. decreases) for star fields with smaller (resp. larger) 
FWHM. Finally, all methods reach a slightly higher accu- 
racy on masked images, especially with 6-fold masks. 



weights are calculated, which leads to a loss in accuracy in 
the corresponding regions. Kriging variants with ability to 
correct such a drift, like Universal Kriging, would probably 
achieve better results. Also, our implementation of Kriging 
for the Star Challenge assumes spatial isotropy, even though 
experimental variograms for ellipticity on non-turbulent 
sets also show evidence of geometric anisotropy, A more 
sophisticated implementation could have corrected these 
eff'ects by rescaling and rotating coordinate axes along the 
direction of maximum spatial continuity. 



6.3. Results from individual interpolation methods 

- Interpolation with radial basis functions (RBF) As shown 
in our previous discussion, the RBF interpolation scheme 
is the overall winner of our evaluation. According to our 
benchmarks, ellipticity patterns were best estimated by a 
linear kernel function, whereas a thin-plate kernel was 
more eff'ective on FWHM values. A neighborhood size 
between 30 and 40 stars was used. Refer to Sect. 13.51 and 
Table [5] for a description of RBF interpolation and the 
definitions of these kernels. That combination of linear and 
thin-plate kernels yields very competitive error statistics 
on both turbulent and non-turbulent sets: Tables [14] and 
15 as well as plots Fig. [7] show RBF is the most accurate 
on turbulent sets whereas its results on non-turbulent 
sets are the second best behind ordinary Kriging. The 
possibility of selecting the most suitable kernel for a given 
PSF patterns is a very attractive feature of RBF interpolation. 

- Inverse distance w eight ed interpolation (IDW) The IDW 

methods (see Sect. |3.4| ) obtains the second best average 
E{e) behind RBF over all sets as seen in Table [13] It does 
so thanks to very competitive E{e) results on turbulent sets, 
just behind RBF (Table [15]). But IDW's estimates of the 
FWHM on non-turbulent sets are by far the worst of all 
five interpolation algorithms , bo th on ellipticity and size 
(Table [14]). As found in Sect. |6.2| IDW looks quite sensitive 
to variations in star density. In fact, we observe that IDW 
underperforms on star fields with low-density and smaller 
FWHM (sets 5, 6, 13, 14, 25, 26). We were unable to find 
a setup that significantly improves that level of accuracy, 
which suggests the method has difficulty coping with such 
constraints in density and size. 

All in all, IDW performs quite well overall, knowing it is 
based on a very simple interpolation algorithm, with fewer 
adjustable parameters than RBF or ordinary Kriging (see 
Sect. [34]). 



- Interpolation with ordinary Kriging (OK) Despite its 
reputation of best interpolator on spatially- scattered data, 
ordinary Kriging, introduced in Sect. 3.6 arrives only third 
behind RBF and IDW when considering error statistics in 
Table 13 As see in shown in Table 14 and plots Fig. [8 
Kriging 's estimates on non-perturbed sets are the best of al 
five methods. But this cannot compensate for its relatively 
poor performance on estimating the FWHM on turbulent 
sets, as shown in the value of E{R^) in Table 15 The reason 
for this is probably related to the significant spatial drift 
of the FWHM values across the image. The condition of 
intrinsic stationarity required by ordinary Kriging is no 
longer fulfilled in some areas, especially near the edges 
of the image. As a result, we were forced to reduce the 
size of the search neighborhood over which the Kriging 



Polynomial fitting (Polyfit) The results of Polyfit are of 
particular interest since polynomial fitting is currently the 
method of choice for modeling spatial variations of a PSF 
in lensing studies (see Sects. [2land |3.1| ). polynomial fitting 
performs relatively well on non-turbulent sets with E{e ) 
and E{R^) statistics fairly close to those of RBF (Table 14). 
However, the corresponding statistics on turbulent sets are 
significantly worse that those achieved by local methods , as 
seen in Table [15] This confirms the conclusion of Sect. 13.11 
whereby polynomials have difficulty coping with small 
or rapid variations found in a PSF pattern. Low-degree 
polynomials generally produce satisfactory result but tend 
to underfit the data, which leads to suboptimal accuracy. 
The resulting interpolation surfaces are characteristically 
smooth, as clearly observed in the Polyfit plot of Fig. [9] 
The Star Challenge images without KM power spectrum are 
smooth enough for Polyfit to approach the accuracy of RBF 
and ordinary Kriging. These results were obtained with a 
fifth-degree polynomial, higher degrees degrading the fit. 



Interpolation with Basis splines (B-splines) Polynomial 
splines are generally considered superior for interpolation 
than simple polynomials as explained in Sect. |3.3| and we 
would have expected B-splines to achieve better results than 
Polyfit on the Star Challenge data. But this is not reflected in 
the averaged results from Tables [13] The level of accuracy 
reached by both interpolators is nevertheless of the same 
order. 

As seen in Table [14] and plots Fig. [8j the ellipticity esti- 
mates from B-splines are superior to those of Polyfit on 
non-turbulent sets and of similar accuracy on turbulent 
ones. This tends to confirm the better ability of splines to 
capture small-scale and rapid variations in the data than 
polynomials. The results show, however, errors E(R^) on 
the FWHM much larger for B-splines than for Polyfit, 
which explains the relative lower performance compared to 
Polyfit. The FWHM spatial distribution being overall quite 
smooth in the Star Challenge images, this result suggests 
polynomials may be better suited than splines for modeling 
smoothly-varying patterns of variation. Combining both 
schemes may also be worth investigating. 



7. Conclusions 

The GREAT 10 Star Challenge gave us the opportunity to eval- 
uate several interpolation methods on spatially-varying PSF 
fields: 

- Two global, approximate and deterministic spatial interpola- 
tion schemes: polynomial fitting (Polyfit) and basis splines 
(B-splines). 
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- Two local, exact and deterministic techniques relying on 
inverse distance weighting (IDW) and radial basis functions 
(RBF). 

- An implementation of ordinary Kriging, a local, exact and 
stochastic spatial prediction method, frequently used in 
Geostatistics and environmental sciences. 

We used a three-stage PSF estimation pipeline, which we 
described in Sect. 4j2 and Appendix |A| Elliptical Moffat pro- 
files were fitted to the stars contained in each Star Challenge 
image and then estimated and reconstructed at new positions in 
the same image using one of the five interpolation schemes listed 
above. 

That approach proved quite successful since it allowed us 
to win the GREAT 10 Star Challenge. We were, however, disap- 
pointed by the relatively high cr^y^ values reached, of the order 
of 10""^, i.e., still far from the cr^^^ < 10"^ target demanded by 
future large weak lensing surveys. The lack of accuracy could 
be traced to the suboptimal fitting of Airy PSF profiles by our 
pipeline and not to a deficiency in the PSF interpolation meth- 
ods. However, this issue made it difficult to unambiguously con- 
clude on the level of accuracy of individual interpolation algo- 
rithms, which is the main objective of this article. 

In order to measure errors purely due to interpolation and 
only these, we used the true input ellipticity and FWHM cat- 
alog for the input Star Challenge images instead of our fitted 
estimates for these quantities. We also chose new metrics, better 
suited than the P-factor for assessing estimates on ellipticity and 
size. The results are summarized in Tables 13 14 and 15 along 
with the corresponding plots in Figs.[7]and]8pWe highlight our 
main conclusions below. 



- Table 13 shows the overall E(e) and E(R^) errors to be of 
the order of 10"^ and 10"^ respectively. Fig [l4j however 
indicates that E(e) ~ 10""^ and E(R^) ~ 10"^on images 
devoid of Kolmogorov turbulence, to be compared with the 
E(e) < 10"^ and E(R^) < 10"^ estimated requirements of 
future next-generation surveys. Although the Star Challenge 
PSF fields lack realism in certain aspects, this suggests 
that the best methods, RBF, IDW and OK, may already be 
suitable for space-based surveys where turbulence is absent. 

- All interpolation methods see their accuracy drastically de- 
graded in images where atmospheric turbulence eff'ects have 
been simulated, with E(e) and E(R^) errors increased by a 
factor of ~ 100. The better performance on turbulent images 
of RBF, IDW and OK compared to Poly fit and B -splines 
in the GREAT 10 Star Challenge, suggests local methods 
may be able to better cope with turbulence than global ones. 
We note, however, that these results are only valid for the 
specific turbulence model used in the simulations and would 
have to be confirmed on real data. 

- After turbulence, the factors influencing results the most 
are the density of stars and their size. As far as density is 
concerned, local methods are more impacted than global 
ones and generally improve their estimates on denser sets 
much more than global methods. A similar conclusion is 
reached concerning local methods as far as PSF size is 
concerned. However, the results suggest both global and 
local methods have difl&culty coping with objects smaller 
than the fiducial FWHM of 3 pixels. Among all methods, 
IDW suff'ered the most from sparse star fields with small 
FWHM. 



- The RBF interpolator proved the most accurate, reaching 
the best results on both turbulent and non-turbulent sets. 
The use of kernel functions brings additional versatility 
compared to a simpler interpolator like IDW, while avoiding 
the complexity of Kriging. The selection of the most suitable 
kernel function and associated parameters can be greatly 
simplified by the use of cross-validation or Jackknifing. 
These techniques, as shown in Sect. 4.3 can prove very 
helpful to tune the run-time parameters of an interpolation 
schemes and evaluate the accuracy of its results. 

- Despite its simplicity, the IDW interpolation method 
obtained better than expected results, outperforming polyno- 
mials and splines in the simulations. Fast and easy to tune, it 
could potentially constitute a simple alternative/complement 
to polynomials before trying more elaborate interpolation 
schemes such as Kriging or RBF. 

- Ordinary Kriging is, in our opinion, potentially the most 
accurate method as shown especially by its results on 
non-turbulent images. However, the FWHM spatial distri- 
butions in the Star Challenge have a significant spatial drift 
that the standard ordinary Kriging algorithm is unable to 
correct. Another Kriging variant such as Universal Kriging 
would possibly have proved more accurate. It remains that 
Kriging, because of its sophistication, is more difficult and 
time consuming to operate than the other interpolators we 
evaluated. 

- Overall, our analysis of the Star Challenge results sug- 
gests local interpolators should be preferred over global ones 
based on splines and polynomials. However, one should bear 
in mind that (1) these results are based on simulated data 
where star images are isolated, bright enough and well sam- 
pled; (2) the spatial variation of the PSF as simulated in 
GREAT 10 may tend to favor local interpolators over global 
ones. We strongly believe, nevertheless, that local interpo- 
lation schemes for PSF interpolation have the potential to 
improve the accuracy of existing and future ground-based 
lensing surveys and deserve to be investigated further. 
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Appendix A: Our PSF prediction pipeline (PSFPP) 

A.1. Overview 

The PSF prediction pipeline used in the Star Challenge is out- 
lined in Fig. [5] A PSF field is fed into the pipeline and goes 
through three processing stages: 



1. Fitting stage: the Moff'at PSF model described in Sect. 4.1 
is fitted to each star at known position (Xc,yc) in the PSF 
field image. A catalog is produced, containing a set of fitted 
parameters |(Xc,}^c); (^i. ^2); 0, for each star. Instead 
of an out-the-box minimizer, we employ a custom minimizer 
we developed at the EPFL Laboratory of astrophysics and 
well suited to fitting faint and noisy images like those 
frequently found in weak lensing. The minimizer uses an 
"adaptive cyclic coordinate descent algorithm" that find a 
local minimum with the lowest of the residuals. That 
same minimizer has also been used in the version of the gfit 
shear measurement method that competed in the GREAT 10 
Galaxy Challenge ( Kitching et al.||2QT2a| ). The star images 
processed by the minimizer are 16 x 16-pixel cutouts instead 
of the original 48 x 48-pixel postage stamps. 

2. Prediction stage: 

- First, an analysis of the spatial distribution of each pa- 
rameter across the image is performed. In particular, all 
separation distance s between star s are recorded in the 
form of KD-trees ( |Bentley|1975| ) for efficiently finding 
the nearest neighboring stars located within a given sep- 
aration distance ||h||. 

- Second, a spatial prediction scheme is applied to 
estimate the values Z^(xJ,jJ) of the parameter p at 
asked locations (xj, jj), given the fitted parameter values 
^pi^i^yd obtained in the previous stage. One of the four 
methods described in section [3] is applied here. 

3. Reconstruction stage: All stars in a PSF field are recon- 
structed based on the elliptical Moff'at model described in 
Sect. |4.1| but using the parameters predicted for that star dur- 
ing the Prediction stage. 

A.2. Pipeline implementation and configuration 

The pipeline code is written in Python, a programming language 
known for its power, flexibility and short development cycle. The 
usual standard Python libraries are used, notably: NumPy, SciPy, 
PyFITS and matplotlib. SciPy is the standard scientific library 
for Python. Most of its functions are thin Python wrappers on 
top of fortran, C and C-h-h functions. SciPy takes advantage of 
installed optimized libraries such as LAPACK (Linear Algebra 
PACKage) library ( Anderson et al.|199Q] ). We employ th e cro ss- 
validation and Jackknifing resampling techniques (see 4.3) to 
tune the run-time parameters for the interpolation schemes and 
evaluate the accuracy of the results. We highlight below a few 
aspects related to the implementation of the methods. 



RBF: we use the rbf () interpolation function available in 
the SciPy interpolate module. The number of parameters to 
tune is greater compared to IDW: a kernel function chosen 
among those listed in Table |5| the neighborhood search 
size N\ a shape parameter e for the multiquadric, inverse 
multiquadric and Gaussian kernels; and a last paramet er fo r 
controlling the smoothness of the interpolation (see 3.5 ). 
Only the linear , thin-plate and multiquadric kernels gave 
stable enough predictions. Choosing 25 < A/^ < 30 and 
disabling smoothing (i.e. use exact interpolation) yielded the 
best cross-validation and Jackknifing results for the chosen 
kernels. 



splines: we have selected the bisplrepO and bisplevO 
bivariate B- spline interpolation functions provided by the 
SciPy interpolate module. These functions are Python 
wrappers on top of the fortran FITPACK package (Dierckx] 
1995). The underlying algorithms follow the cons tructive 
approach for spline interpolation described in 33 and are 
specified in D ierckx| ( |1980j ). The main parameters aff'ecting 
the interpolation are the degree p of the spline, the number 
of knots N and a smoothing factor s. We have fixed p io ?> 
but let the algorithm automatically set N and s. 



Kriging: we have used our own custom-developed Python 
code of ordinary Kriging (see Sect. 3.6 ). The Kriging used in 
the Star Challenge and in this article is isotropic and does not 
implement any spatial anisotropy or drift correction scheme. 
The accuracy of the ordinary Kriging interpolation scheme 
was influenced by the following set of parameters: 

- The interpolation range, i.e. the range in pixels used for 
interpolation. Depending on the images, we chose a cir- 
cular area with a radius between 700 and 1000 pixels 
from the center of the 4800 x 4800 PSF field. 

- Lag distance h in pixels. We used values in the range 
100 < /z < 300 depending on the image and the PSF 
model parameter to estimate. 



- The number of observations N in Eq. ( 19 ) to include in 
the neighborhood: we used 5 < < 20 depending on the 
image star density. 

- Tolerance distance A/z (pixels) and angle A^ considered 
when locating neighboring observations. As a rule of 
thumb, we selected A/z » h/l and A^ = 22.5°. 

- A theoretical variogram model such as those listed 
in Table [7] The experimental variograms were fitted 
using the Levenherg-Marquardt least-squares least sq 
routine from the SciPy optimize module. The program 
dynamically selected the theoretical variogram models 
and parameters that produced the best fit. 

The Poly fit code is based on the leastsqO function from 
the SciPy optimize Python module. A least- squares fit to a 
bivariate polynomial of degree 5 gave the best estimates. 



- IDW: the code for Inverse Distance Weighted interpolation 
is written in Python, based on Eq. ^ with weighting factors 
specified by (|8]). The free parameters are the power factor 
P and the neighborhood size N (see |3.4[ ). A configuration 
with = 2, with 5 < N < 15 depending on the density 
of stars in images gives the best results according to our tests. 
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